DNA loop formation is one of several mechanisms used by organisms to regulate genes. 
The free energy of forming a loop is an important factor in determining whether the 
associated gene is switched on or off. In this paper we use an elastic rod model of DNA to 
determine the free energy of forming short (50-100 basepair), protein-mediated DNA loops. 
Superhelical stress in the DNA of living cells is a critical factor determining the energetics 
of loop formation, and we explicitly account for it in our calculations. The repressor protein 
itself is regarded as a rigid coupler; its geometry enters the problem through the boundary 
conditions it applies on the DNA. We show that a theory with these ingredients is sufficient 
to explain certain features observed in modulation of in vivo gene activity as a function of 
the distance between operator sites for the lac repressor. We also use our theory to make 
quantitative predictions for the dependence of looping on superhelical stress, which may be 
testable both in vivo and in single-molecule experiments such as the tethered particle assay 
and the magnetic bead assay. 

PACS numbers: 87.14.Gg, 87.15.La, 82.35.Pq, 36.20.Hb 



I. INTRODUCTION AND SUMMARY 
I.A. Introduction 

Many genetic processes in bacteria are controlled by proteins that bind at separate, often widely 
spaced, sites on DNA and hold the intervening double helix in a loop. For example, 
the lactose metabolism system in E. coli is controlled by a repressor protein, LacI, binding to 
a set of binding sites. Early evidence for the existence of a looping mechanism came from the 
observation that the ability of a cell to control a particular gene depended in an approximately 
periodic way upon the number of basepairs of DNA intervening between two particular 
protein-binding sequences (called "operators") (see for example [l|, S, @, Q]). Some recent data appear 
in Fig. 1. The periodic modulation was found to be roughly independent of the details of what 
basepair sequence was inserted or deleted between the operators; insertions and deletions elsewhere 
did not affect the gene regulation in this way. 

FIG. 1: a. Repression of a gene product controlled by the lac repressor in E. coli cells. The data are from [8], Figure 
3a; see that paper for an explanation of the units on the vertical axis. The horizontal axis gives the distance along 
the DNA between the centers of the two operators, each 21 basepairs long. In this paper we will express operator 
spacing by a number $L$ that equals this number minus 21bp (Fig. 2). Other experiments have obtained similar curves 
using operators located on a plasmid Q|. b. Looping free energy inferred from the data in (a), showing a roughly 
periodic modulation with operator spacing (from [10], Figure 3). The maxima of this function correspond to poor 
looping efficiency, that is, to the minima in panel (a). There is a slight minimum in the lower envelope of this function 
around 70-80bp, corresponding to our $L \approx 50$-60bp. A similar function emerges from the more detailed analysis of 
Garcia et al. II il l. 

The interpretation of these results followed an analogy to the related process of DNA cyclization. 
Suppose that a regulatory protein binds stereospecifically to the two operators, forcing them into 
close physical proximity, with the intervening DNA forming a loop (Fig. 2). The equilibrium 
constant for this isomerization reaction depends on the free energy change, which contains as a 
component the elastic energy cost of forming the loop. The elastic energy, in turn, contains terms 
reflecting bending and twisting deformation. For a favorable value of the interoperator spacing, 
loop formation involves only bending of the DNA. For spacing differing slightly from the optimum, 
however, bringing the loop ends into the relative orientation imposed by the protein complex 
requires an additional rotation of one end about its axis. The extra elastic energy cost entailed 
by this deformation reduces the equilibrium constant for loop formation relative to the optimal 
spacing. But if the spacing is increased by a full helical turn $L_{\rm helix}$ (about 11 basepairs¹), then 
we once again have a twist-free loop solution, a relatively low elastic energy cost, and hence a 
relatively high level of gene regulation. Thus the hypothesis of loop formation predicts a periodic 
modulation of the regulatory efficacy with loop size, as observed. Mossing and Record put forward 
a version of this theory shortly after the first experimental results [l^]. 

¹ Although the canonical value of the DNA pitch $L_{\rm helix}$ is quoted as 10.5bp, this value in fact depends on the temperature, 
solution conditions, superhelical stress and so on. In this paper we will use the value 11bp as an approximation to 
the actual period. 

FIG. 2: a. Crystallographically derived structure of the lac repressor (LacI, dark gray) bound to two operator 
segments of DNA (black). The light gray curve represents a DNA conformation interpolating between the 
operator segments, obtained in Ref. [14]. b. Cartoon of the class of loops we will study. The DNA is considered 
free in the region between the two exit points $s_i$ and $s_f$. These exit points are located at ±10.5 bp from the operator 
centers. Within the binding sites themselves, the protein may induce kinks in the DNA, as shown. c. Definition 
of the initial and final tangent vectors $\mathbf{t}_i$, $\mathbf{t}_f$, the separation vector $\mathbf{a}$, and the angle $\theta_a$ characterizing our idealized 
DNA-protein complex. $\mathbf{a}$ is the vector joining the two exit points, located at arclengths $s_i$ and $s_f$. The arclength 
separation between exit points is $L = s_f - s_i$. The vectors $\mathbf{t}_i$, $\mathbf{t}_f$, and $\mathbf{a}$ are all assumed to be coplanar, and the 
angle $\theta_a$ from $\mathbf{a}$ to $-\mathbf{t}_f$ is equal to that from $-\mathbf{a}$ to $\mathbf{t}_i$ (the "planar, symmetric coupler" idealization). In the example 
shown, $\theta_a > 90°$. Although the coupler is planar, the loop itself will not in general be so, as illustrated here. d. The 
★-loop configuration corresponding to (c) (see text, Sect. III.C). This loop is always planar. 

Later, looped DNA complexes similar to those inferred by the above argument were seen directly 
in electron microscopy (e.g. [6]) and other modalities. More recently, single-molecule experiments 
have demonstrated DNA looping in vitro, and enabled the systematic study of the looping reaction 
as a function of external parameters such as stretching force [15]. On the structural side, 
advances in x-ray crystallography have yielded structures for the operator-protein complex, for 
example in the lac operon system [13, 17, 18]. Starting with that work, many authors have 
sought to determine the detailed form of the loop using physical modeling (see Sect. II.A). A 
more ambitious goal would be to predict the looping free energy function, which has recently been 
phenomenologically extracted from experimental studies of gene repression in vivo (for example 
[20]; see Fig. 1b). This paper is intended as a step in that direction. 



I.B. Goals of this work 



Our goal in this paper is to introduce one important physical aspect of looping, relevant both in 
vivo and in single-molecule assays. This is the presence of significant torsional stress (supercoiling) 
in the region of DNA outside the loop-forming tract. Certainly everyday experience teaches that 
external torque can predispose an elastic rod (such as a garden hose) to form a loop. 

A simple estimate shows that this external stress can significantly alter the equilibria between 
the unlooped state and various alternative looped states. As we will review later, bacteria maintain 
their chromosomal DNA in a negatively supercoiled (undertwisted) state, with a local torsional 
stress $M_{\rm ext}$ of about $-4$ pN nm (see Sect. II.A.2). Formation of a loop can relax the external DNA by 
an angle on the order of $\pm\pi$ radians, allowing the external torsional stress to do work 
$\sim \pm\pi M_{\rm ext} \approx \pm 3\,k_{\rm B}T$ on the looping complex. This energy scale is comparable to the looping energies inferred 
from data (Fig. 1b), so we must account for it. Indeed, previous authors have already documented 
a large effect of supercoiling on a related process, the juxtaposition of sequentially distant points 
on a long circular DNA molecule [21, 22]. We wish to study similar effects in a simple way, in the 
context of DNA looping. Specifically, we will calculate, in a simplified model, the quasiperiodic 
dependence of the looping free energy (Fig. 1b) on the interoperator spacing $L$. 

We also give a procedure to find, in an idealized physical model, the equilibrium shapes and 
energies of an elastic rod under the sort of end constraints appropriate to DNA loop formation by 
a protein complex. Our method uses the explicit analytic solutions to the elastic-rod equations, 
and hence enjoys significant computational advantages over gradient-descent algorithms. 

Our results show that indeed external torque affects looping equilibria, and can change which 
of multiple looped states is most favorable. Specifically, the shape of the looping free energy 
curve reflects exchanges of stability as L increases; the critical values of L for these exchanges 
(local maxima of Fig. 1b) depend on the external torsional stress. These results can be tested, 
for example in vivo by studying bacteria with varying levels of supercoiling density ([23], section 
2.II.D), or in vitro by the methods of Lia et al. [15]. The methods developed in this paper may 
also be applicable to other systems where DNA loops are implicated Q]. 

The paper is organized as follows. Sect. II outlines some prior work and sets out the many 
simplifications we introduce to keep the treatment of external supercoiling relatively transparent. 
Sect. III gives more details of our calculation strategy. Sect. IV presents the actual calculation, and 
Sect. V discusses the results. Expert readers wishing to see the key new elements of our approach 
will find them in Sects. III.B, III.C, and IV.B. 

The Appendix gives a glossary of symbols used in the text. 



II. PHYSICAL FRAMEWORK 



In the first subsection below we describe some of the physical ingredients that enter into the 
problem of modeling DNA looping. It is not possible to survey here the large literature on such 
models, but we will mention some of the prior work incorporating these ingredients. Mainly we 
discuss work on the lac system, but extensive work has also been done on other regulatory systems, 
such as gal and lambda (e.g. [3]), and on the binding of nucleosomes to miniplasmids 
(e.g. |2£). 



II.A. Ingredients of the problem and prior work 



II.A.1. Loop structure 



The crystallographic work of Lewis and coauthors gave only the structure of the regulatory 
protein complex bound to two short DNA segments containing the operators. Following this work, 
several authors used elastic models of DNA to propose structures for the complete looped state 
(for example, [Q, 0, 27, 28, 29, 30, 31]). (Earlier authors studied similar mathematical problems 
in other contexts, for example, [32, 33].) The basic premise of these works is that the regulatory 
protein complex binds to two specific elements on the DNA, with a fixed, specified orientation 
relative to it (the "strong anchoring end condition" [26]). The DNA between the two binding 
regions must accommodate to these constraints by adopting a form different from the one it would 
otherwise have taken; finding this form is a boundary-value problem in the elasticity of a slender 
body. These works included varying levels of realism in their treatment of the DNA elasticity: 
Some included bend anisotropy, sequence dependence, and electrostatic effects. Some, however, 
neglected DNA twist stiffness altogether, and so could not address the periodic phasing dependence 
that is part of our main motivation. 

Several authors have recognized that there may be alternate DNA binding patterns, giving rise 
to multiple looping states (for example [20, 24, 34]). We discuss this further in Sect. II.B.3 below. 

In addition to purely elastic effects, it has long been recognized that the conformation of DNA 
is critically affected by chain entropy. An early calculation including these effects was Shimada 
and Yamakawa's study of DNA cyclization, the formation of circular DNA from linear pieces; later 
work has extended and refined their results [35, 36, 37]. Recent work on DNA looping has begun to 
incorporate entropic corrections following a similar calculational approach [38, 39]. Although 
these corrections can be significant, for short loops the strong anchoring condition constrains the 
DNA so much that elastic effects dominate over conformational entropy (at least for understanding 
the periodic phasing dependence that is our central concern). 

Other calculations have acknowledged that the protein complex formed in DNA looping may 
not be a rigid object; stresses transmitted from the bent DNA may distort the protein, or even 
induce major conformational changes in it aa I4ai4i|. 



II.A.2. External supercoiling 



In the absence of external constraints and thermally-induced deformation, DNA would be a 
double helix with helical pitch $L_{\rm helix} \approx 11$ bp $\approx 3.7$ nm. We define a corresponding quantity 
$\omega_0 = 2\pi/L_{\rm helix}$, the angular rate at which the two strands orbit their common centerline. 
Closed circular DNA isolated from bacteria generally shows negative supercoiling [23]. This 
supercoiling is expressed as the fraction $\sigma$ by which the total linking number differs from the value 
$L_{\rm tot}\omega_0/2\pi$ appropriate for a torsionally relaxed, circular loop; thus bacteria have $\sigma < 0$. The value 
of $\sigma$ can vary with the life conditions (e.g. temperature) of the cell; it can vary from cell to cell 
and with the division cycle of a single cell; and even within a single cell, at one moment, there may 
effectively be domains of different $\sigma$ [23]. 

Moreover, the topological linking number is not simply related to the quantity of interest to 
us, which is the torsional stress $M_{\rm ext}$. First, in the bacterial cell various DNA-binding proteins 
can effectively absorb some linking number, similarly to the role of histones in eukaryotes. This 
binding results in a reduced effective value $\sigma_{\rm eff}$ (sometimes called the "superhelical stress") in the 
range of $-2.5\%$ to $-5\%$ [23, 42, 43, 44]. (Interestingly, the corresponding value for eukaryotes is 
close to zero [23].) 

Second, even $\sigma_{\rm eff}$ partitions into two components, corresponding to twist and writhe. Only the 
twist component, roughly one quarter of the total [45], gives rise to torsional stress $M_{\rm ext}$. We 
estimate $M_{\rm ext}$ using Hooke's law, $M_{\rm ext} = K_{\rm tw}\Delta\omega$, where $K_{\rm tw} \approx 70\,{\rm nm}\,k_{\rm B}T$ is the twist stiffness 
of DNA under zero tension and $\Delta\omega = (\tfrac{1}{4}\sigma_{\rm eff})\,\omega_0 \approx -0.017/{\rm nm}$ from the above estimates. Thus 
$M_{\rm ext} \approx -k_{\rm B}T$ within the wide uncertainties implied by the preceding paragraphs. In particular, 
the dispersion in $M_{\rm ext}$ values implies that the observed repression curve (Fig. 1a) will be an average 
over a distribution of $M_{\rm ext}$ values. 
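The Hooke's-law estimate above is easy to reproduce numerically. The following is a minimal sketch, not part of the original analysis: it assumes $k_{\rm B}T \approx 4.1$ pN nm (room temperature), takes the twist stiffness as 70 nm $k_{\rm B}T$ (our reading of the value quoted above), and uses the one-quarter twist fraction stated in the text. The function name and constants are illustrative only.

```python
import math

KBT_PN_NM = 4.1      # thermal energy in pN nm (assumed room-temperature value)
L_HELIX_NM = 3.7     # helical pitch quoted in the text, in nm
K_TW_NM = 70.0       # twist stiffness in units of kBT nm (our reading of the text)


def torsional_stress_kbt(sigma_eff, twist_fraction=0.25):
    """Return (delta_omega in rad/nm, M_ext in units of kBT).

    delta_omega = twist_fraction * sigma_eff * omega0, and
    M_ext = K_tw * delta_omega (Hooke's law for twist).
    """
    omega0 = 2.0 * math.pi / L_HELIX_NM
    delta_omega = twist_fraction * sigma_eff * omega0
    m_ext = K_TW_NM * delta_omega
    return delta_omega, m_ext


# Scan the range of effective superhelical densities quoted in the text.
for sigma_eff in (-0.025, -0.04, -0.05):
    d_omega, m_ext = torsional_stress_kbt(sigma_eff)
    print(
        f"sigma_eff={sigma_eff:+.3f}  delta_omega={d_omega:+.4f}/nm  "
        f"M_ext={m_ext:+.2f} kBT ({m_ext * KBT_PN_NM:+.1f} pN nm)  "
        f"pi*M_ext={math.pi * m_ext:+.2f} kBT"
    )
```

Over this range the torque comes out at roughly one $k_{\rm B}T$ in magnitude, and the work $\pi M_{\rm ext}$ at a few $k_{\rm B}T$, in line with the order-of-magnitude statements in the text; the spread across the range illustrates the dispersion in $M_{\rm ext}$ mentioned above.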

None of the prior work mentioned in Sect. II.A.1 introduced external torsional stress 
(supercoiling) quantitatively; that is the goal of the present work. This neglect is justified when studying 
loop formation in open (linear) DNA segments. Even in the context of a circular, supercoiled 
DNA, the strong anchoring condition implies that the interior of the loop is unaffected by external 
torsional stress (if we neglect possible stress-dependent deformation of the protein). Hence for 
any given looped state, the geometric shape of the loop does not depend on this stress. However, 
supercoiling does alter the equilibrium among the different looped states, and between them and 
the unlooped state, and hence it will affect curves such as those in Fig. 1. 

Swigon and coauthors do discuss the role of supercoiling qualitatively [3J]. As mentioned earlier, 
Vologodskii and coauthors also studied its effect on site juxtaposition, in a large Brownian dynamics 
simulation [22]. We seek a framework for looping calculations in which such effects can be modeled, 
at least approximately, without recourse to such large computations. 



II.B. Framework and idealizations used in this paper 



We will make many simplifying assumptions in this paper, in order to focus more clearly on 
effects of interest to us. Some of these assumptions are already familiar from others' earlier work. 
Taken together, these simplifications preclude detailed quantitative comparison with experimental 
data like those in Fig. 1. But the lessons we learn can be readily transferred to more detailed 
models. 



II.B.1. DNA 



Although bending anisotropy, sequence dependence, nonlinear DNA elasticity [46, 47], and 
perhaps even strand separation are likely to be important to the quantitative understanding of 
loop formation, we neglect them all. That is, we treat DNA as a continuous, inextensible, isotropic 
elastic rod, with a linear relation between stress and the resulting strain (the Bernoulli-Euler 
approximation, Eq. 23). We will also neglect electrostatic effects, or more precisely, assume that 
they can be incorporated via effective values of the DNA bend stiffness and the binding 
constants for the protein. The advantage of these simplifications is that they will let us use the 
elegant closed-form solutions to the elastic equilibrium equations (Sect. IV). 

Although the free DNA is assumed to be long, and so has significant configurational entropy, 
as mentioned earlier we will neglect fluctuations of the DNA inside the loop, and their entropy, 
because we are interested in short loops. The ideas advanced in this article can be applied to more 
elaborate calculations involving chain entropy effects. 

We will assume that within the loop, we may neglect self-contact of the DNA. Thus we can 
only find the simplest one or two topoisomers in any given situation, because higher topoisomers 
are generally stabilized by self-avoidance. However, at least at moderate values of the external 
supercoiling, the higher topoisomers have very high elastic energy and may indeed be neglected. 

Finally, we will assume that there are no other DNA-binding proteins in the system that can 
bind to the loop region, altering its elasticity or even imposing sharp bends on the DNA. In fact, at 
least one such protein was present in the experiment of Fig. 1, namely the heat unstable nucleoid 
protein (HU). But similar data can be obtained from mutant bacteria that are missing particular 
proteins (e.g. HU [9]), and in any case in vitro assays can also be performed with no other proteins 
present. 



II.B.2. Protein 



The repressor protein complex, like any protein, is flexible: It can deform under stress, and in 
the case of LacI can even pop into an extended conformation. We will neglect these effects, treating 
the protein complex as a rigid jig, or clamp, which we will call the "coupler".² The geometry of 
the coupler is independent of the length of the DNA between the two operators. 



² This simplifying assumption may be more realistic for other complexes, such as the lambda cI repressor, which are 
thought to be more rigid than LacI. 

II.B.3. DNA-protein binding 

The LacI protein complex is a tetramer with two binding sites for DNA. Each binding site can 
bind strongly to specific operator sequences, or more weakly to generic DNA, or not at all. Bintu 
and coauthors have argued that for LacI, in vivo, both sites are nearly always bound to DNA; the 
strong binding to a few specific sites competes with the weak binding to many generic sites [ll|, [jjj. 
We will instead simplify by assuming that the protein consists of two halves, each permanently 
bound to its operator site. The looping reaction then consists of these dimers finding and 
binding to each other, thus imposing a fixed relative orientation on their bound operator DNA.³ 

The specific binding of LacI at each site is thought to have a two-fold degeneracy arising from 
the symmetry of each dimer: The operator DNA may be reversed in direction and still bind 
equally well. This degeneracy leads to four competing loop states [20]. We will neglect this 
complication, assuming that only a single DNA orientation is allowed at each binding site. (The 
binding orientations we choose produce what is often called the "parallel loop" state [3J].) The 
equilibrium between distinct binding orientations can be handled by the same methods as those 
used in the present paper for the equilibrium between different looped states. 

The geometry of the lac repressor complex is known to be chiral. Thus even in the absence 
of any external torsional stress, the protein complex itself predisposes the DNA to loop with a 
particular helical sense. One contribution to this chirality comes from the fact that in the cartoon 
of Fig. 2c, the arrows representing the required DNA tangent vectors do not lie in the plane of 
the figure, but instead tilt slightly into the page on their right sides, by an angle often called $\beta$ 
[34]. We will neglect this effect, assuming that the two boundary conditions correspond to tangent 
vectors in the same plane as the separation vector between the detachment points ($\beta = 0$). We 
call this assumption the "planar coupler" condition. The methods of this paper can be extended 
to handle the case of nonzero $\beta$. Note that even with a planar coupler, the DNA loop itself need 
not, and generally will not, be planar. Thus in the structure sketched in Fig. 2c, the DNA will not 
in general contact itself in the middle of the loop. 

Protein binding generally bends DNA, and in some cases untwists it as well. Because we treat 
the protein as permanently bound to each operator, we need not worry about these effects, as long 
as the entrance points $s_i$, $s_f$ (Fig. 2), and their corresponding tangent vectors, also lie in the same 
plane as the one just mentioned. We thus add this requirement to our "planar coupler" condition. 

There are two other sources of chirality (besides nonzero $\beta$ and protein-induced unwinding 
mentioned above), which we do retain: First, as mentioned above, the axial orientations of the two 
binding sites can induce a nonplanar equilibrium shape for the DNA loop, even if the coupler obeys 
the planar condition. Also, of course, any external supercoiling introduces another chiral ingredient 
into the problem. We believe that these two effects are more important for the qualitative structure 
of Fig. 1 than the twist angle $\beta$, and in any case they are the effects that we have chosen to study 
in this paper. 

³ Again, this idealization may be more literally appropriate for other repressors, such as lambda cI. 

We also assume that the angle $\theta_a$ shown in Fig. 2c equals the corresponding angle on the left 
side (the "symmetric planar coupler"; see also [32]). Our choice is motivated by approximate 
symmetries actually observed in crystallographic data on protein-DNA complexes [13, 48]. The 
angle $\theta_a$ may have a different effective value in solution from the one observed in crystallographic 
structures, so we will treat it as an unknown parameter in our analysis. However, we take the 
separation $a$ between the exit points to have a fixed value 4.0 nm, roughly equal to that seen in the 
crystal structure [13]. 



III. CALCULATION STRATEGY 
III.A. Mathematical representation 



We represent a thin elastic rod as a curve in space (the rod's centerline), together with a set 
of orthonormal triads at each point on the curve (the "physical frame"). The third vector of each 
triad, $\mathbf{e}_3(s)$, is chosen to coincide with the tangent to the curve at arclength location $s$. We may 
choose $\mathbf{e}_1(s)$ to point from the centerline toward the major groove of the DNA at position $s$, and 
$\mathbf{e}_2(s)$ to complete the triad. Thus for relaxed DNA in the absence of thermal motion, as $s$ increases 
$\mathbf{e}_3(s)$ points in a constant direction while $\mathbf{e}_1(s)$ and $\mathbf{e}_2(s)$ rotate about it at a constant angular rate $\omega_0$ equal 
to $2\pi$ radians per helical turn. In general we say that a rod has zero excess twist if the momentary 
rate of rotation of its physical frame has 3-component equal to $\omega_0$. 

For many purposes, it is convenient to replace the physical frame given above by an "untwisted 
frame" obtained by rotating the physical frame at each point about $\mathbf{e}_3(s)$ by the angle $-\omega_0 s$. We 
will denote the untwisted frame by $\mathbf{d}_i(s)$, and use it in the calculations of Sect. IV.B. 
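The relation between the two frames can be checked on the simplest case, a straight, relaxed rod. The sketch below is illustrative only: the helper names are ours, the rod is taken to lie along the z axis, and the pitch of 3.7 nm is the approximate value used in the text. It builds the physical frame at arclength s and applies the rotation by $-\omega_0 s$ about the tangent; for relaxed DNA the resulting untwisted frame should not depend on s.

```python
import math

L_HELIX_NM = 3.7                       # helical pitch used in the text, nm
OMEGA0 = 2.0 * math.pi / L_HELIX_NM    # intrinsic twist rate, rad/nm


def physical_frame_relaxed(s):
    """Physical frame (e1, e2, e3) of a straight, relaxed rod along z at arclength s."""
    angle = OMEGA0 * s
    e1 = (math.cos(angle), math.sin(angle), 0.0)
    e2 = (-math.sin(angle), math.cos(angle), 0.0)
    e3 = (0.0, 0.0, 1.0)
    return e1, e2, e3


def untwisted_frame(e1, e2, e3, s):
    """Rotate e1 and e2 about e3 by the angle -OMEGA0 * s; e3 is unchanged."""
    a = -OMEGA0 * s
    ca, sa = math.cos(a), math.sin(a)
    d1 = tuple(ca * u + sa * v for u, v in zip(e1, e2))
    d2 = tuple(-sa * u + ca * v for u, v in zip(e1, e2))
    return d1, d2, e3


# For relaxed DNA the untwisted frame is the same at every arclength.
for s in (0.0, 1.0, 3.7, 12.5):
    d1, d2, d3 = untwisted_frame(*physical_frame_relaxed(s), s)
    print(s, [round(x, 12) for x in d1], [round(x, 12) for x in d2])
```

In this representation any excess twist shows up directly as a residual rotation of $\mathbf{d}_1$, $\mathbf{d}_2$ along the rod, which is what makes the untwisted frame convenient for calculation.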

We represent the coupler mathematically as a condition specifying the relative spatial locations 
and physical frames of the DNA as it exits the two binding sites and enters the loop region (see 
Fig. 2). That is, stereospecific binding to the protein complex requires that the location and 
orientation at positions $s_i$ and $s_f$ are related by a fixed element of the Euclidean group E(3). In 
particular, this relation is independent of the interoperator spacing L. 

We can express the same condition in the untwisted frame $\{\mathbf{d}_i\}$. Now the relation between 
frames at $s_i$ and $s_f$ does depend on $L$, but in a simple way: Compared to the physical frame, the 
required final orientation has an additional rotation about $\mathbf{e}_3(s_f)$, by $-\omega_0 L$. 

FIG. 3: a. Equilibria between the unlooped state U and various looped states $n$, $n'$. We have omitted the regulatory 
proteins, indicating the operator sites by tick marks. We wish to treat the system as a small subsystem of interest 
(inside dashed line), thermodynamically coupled to a large reservoir (outside). Each looped state differs from the 
others by a $2\pi$ rotation of one strand of the DNA about its axis at its binding site. Thus, although the total linking 
number is the same for every state shown, nevertheless the set of looped states divides into classes labeled by an 
integer. The two looped states shown have the same elastic energy inside the dashed lines, but quite different total 
free energy changes because of the torsional stress arising from supercoiling outside the dashed lines. b. One way to 
distinguish topological classes of looped states is to choose a standard reference arc $C$ that completes each of the 
looped configurations, then compute the linking numbers of the resulting closed loops. 

For certain special values of $L$, we will be able to meet the coupler's boundary condition in a 
very simple way, with a loop that stays in the plane determined by the coupler and has zero excess 
twist. These values take the form 

$$L = L_0 + j L_{\rm helix}, \tag{1}$$

where $j$ is any integer and $L_0$ is a constant depending on the coupler geometry. For generic values 
of $L$, however, any loop must either writhe out of the plane, or have twist density different from 
$\omega_0$, or both. 
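Eq. (1) can be turned into a small phase calculation. The sketch below, under stated assumptions, computes the axial rotation by which a given spacing L misses the nearest twist-free value, together with a naive pure-twist energy $K_{\rm tw}\phi^2/(2L)$ for absorbing that rotation uniformly along a loop that stays planar. The value `L0_BP = 55` is purely hypothetical (the text only says that $L_0$ depends on the coupler geometry), the stiffness of 70 nm $k_{\rm B}T$ is our reading of Sect. II.A.2, and the energy ignores the trade-off between twist and writhe that the full calculation accounts for, so it should be read as a crude estimate and not as the result of this paper.

```python
import math

L_HELIX_BP = 11.0            # helical repeat used in the text, bp
NM_PER_BP = 3.7 / 11.0       # rise per bp implied by 11 bp ~ 3.7 nm
K_TW_NM = 70.0               # twist stiffness in kBT nm (our reading of the text)
L0_BP = 55.0                 # HYPOTHETICAL twist-free spacing, for illustration only


def residual_twist_angle(l_bp, l0_bp=L0_BP):
    """Axial rotation (rad, wrapped to [-pi, pi]) by which spacing l_bp
    misses the nearest twist-free spacing l0_bp + j * L_HELIX_BP."""
    phase = 2.0 * math.pi * (l_bp - l0_bp) / L_HELIX_BP
    return math.remainder(phase, 2.0 * math.pi)


def naive_twist_energy_kbt(l_bp, l0_bp=L0_BP):
    """Energy (kBT) of spreading the residual rotation uniformly over the loop,
    assuming the loop stays planar (no writhe)."""
    phi = residual_twist_angle(l_bp, l0_bp)
    length_nm = l_bp * NM_PER_BP
    return 0.5 * K_TW_NM * phi ** 2 / length_nm


for l_bp in range(50, 73, 2):
    print(l_bp, round(residual_twist_angle(l_bp), 3), round(naive_twist_energy_kbt(l_bp), 2))
```

The printed energies vanish at the special spacings of Eq. (1) and peak roughly half a helical turn away, which is the periodic pattern the text describes; the envelope decreases with L because the same rotation is spread over a longer rod.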



III.B. Why the problem is conceptually difficult 



Suppose that we are studying looping in a large, closed DNA molecule ($L_{\rm tot}$ = thousands of 
basepairs), with a particular small separation between the operator sites ($L$ = dozens of basepairs). 
We divide all states into broad classes, or "looping states" (Fig. 3): those that are unlooped, and a 
set of looped states. The fraction of time spent in the unlooped state determines the level of gene 
repression [19], and is in turn determined by the relative free energies of the various states [49]. 

The transitions between looping states do not change the total linking number of the full DNA 
molecule. Nevertheless, there is a topological distinction between the classes of looped states, which 
allows us to refer to them as "topoisomers." To see this, imagine clipping out the looped regions 
of the looped states in Fig. 3a. The strong anchoring condition implies that we can find a small 
reference arc $C$ such that each such clipped region can be completed to a continuous closed loop 
by gluing in the same piece $C$ (Fig. 3b). After this operation we can calculate the linking numbers 
of the two resulting small closed loops, which will in general differ by an integer. 

Clearly the equilibrium between the unlooped and the various looped states will be affected by 
the initial degree of supercoiling in the molecule. We would like to treat the region outside the 
looping region as a "reservoir" and characterize it by a "thermodynamic force" acting on the loop 
region. To see why this is not entirely straightforward, we contrast to an easy problem: Suppose we 
have a small box of air in contact with a large room via a pinhole. For the purposes of calculating 
the average number of gas molecules in the box ($N_1$), we can forget about the size and shape of 
the room, instead characterizing it by a single number, the pressure. The rest of the calculation is 
easy because there is a local, additive conservation law relating $N_1$ to the number $N_2$ of molecules 
in the room, and because the boundary between the two subsystems is fixed. In contrast, in our 
problem the linking number, although conserved, is not locally defined, and the two operator sites 
are free to move in the unlooped state. 

III.C. The ★-loop state 

III.C.1. Decomposition of the free energy change 

Our problem would be easier if we had only to investigate the equilibrium between various 
looped states, not that between looped and unlooped states! After all, a direct transition between 
the states $n$ and $n'$ in Fig. 3a simply involves an axial rotation by $2\pi$. In the limit where the total 
DNA length $L_{\rm tot} \gg L$, the external torsional stress $M_{\rm ext}$ is constant during this process, so the 
exterior region does work on the looping region given by $2\pi M_{\rm ext}$. Adding this work to the change 
in elastic energy gives the total free energy change of the transition. 
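The bookkeeping for a transition between neighboring topoisomers can be sketched in a few lines. This is an illustration under stated assumptions, not the calculation of Sect. IV: the function names are ours, energies are measured in units of $k_{\rm B}T$, the elastic energy difference is a placeholder input, and the sign with which the torque work enters follows the wording above ("adding this work") and should be checked against the orientation convention chosen for n.

```python
import math


def topoisomer_free_energy_change_kbt(delta_elastic_kbt, m_ext_kbt, delta_n=1):
    """Free energy change (kBT) for n -> n + delta_n at constant external torque.

    Elastic energy change plus the torque work 2*pi*M_ext per full axial turn.
    The sign convention for delta_n is an assumption of this sketch.
    """
    return delta_elastic_kbt + 2.0 * math.pi * m_ext_kbt * delta_n


def population_ratio(delta_g_kbt):
    """Boltzmann ratio of final to initial state populations at equilibrium."""
    return math.exp(-delta_g_kbt)


# Placeholder numbers: equal elastic energies, M_ext of about -1 kBT.
dg = topoisomer_free_energy_change_kbt(delta_elastic_kbt=0.0, m_ext_kbt=-1.0)
print(f"delta G = {dg:.2f} kBT, population ratio = {population_ratio(dg):.1f}")
```

Even with identical elastic energies, a torque of order $k_{\rm B}T$ shifts the balance between the two looped states by a large factor, which is why the external stress can change which looped state is favored.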

To see how to extend the above remark to include loop formation, we found it useful to introduce 
a fictitious looping state, which we call the *-state, and to divide the free energy change of looping 
into two pieces: That for the transition from unlooped to the *-state, and that for a subsequent 
transition to the desired physical looped state. 

The *-state is characterized by a modified ★-coupler, identical to the actual coupler except 
for the axial orientation it imposes on the outgoing DNA, which is always chosen to admit a 
planar, untwisted loop regardless of L. One such loop is a non-selfcontacting solution to the elastic 
equilibrium equations; we call it the ★-loop configuration (see Fig. 2d). 

Thus we write the free energy change for formation of looped state $n$ as 

$$\Delta G_{{\rm open}\to{\rm loop}\,n} = \Delta G_{{\rm open}\to\star} + \Delta G_{\star\to{\rm loop}\,n}. \tag{2}$$

We wish to calculate each term on the right. In fact, the second term can be evaluated by the 
same method as outlined in the first paragraph of this subsection. We now turn to discuss the first 
term. 
III.C.2. ★-loop formation free energy 

In the unlooped state, the full circular DNA is freely fluctuating. It has a certain free energy,
which we assume to be extensive (at least over the small length changes we are studying):
$G_{\rm un}(L_{\rm tot}, \sigma) = L_{\rm tot}\,\mu(\sigma)$, where the free energy density $\mu$ depends on the fractional degree of
supercoiling $\sigma$. We imagine cutting the DNA, introducing a full extra unit of linking number,
and resealing it. Examining the resulting change of free energy yields a formula for the external
torsional stress $M_{\rm ext}$:

$$\frac{d\mu}{d\sigma} = \omega_0 M_{\rm ext}. \qquad (3)$$
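
A short worked version of the cut-and-reseal argument may help. It assumes, as a normalization not spelled out at this point in the text, that the relaxed linking number is $\mathrm{Lk}_0 = \omega_0 L_{\rm tot}/2\pi$, so that adding one unit of linking number changes $\sigma$ by $1/\mathrm{Lk}_0$. The resulting free energy change is then equated with the work $2\pi M_{\rm ext}$ done against the torsional stress:

```latex
\begin{align*}
\Delta\sigma &= \frac{1}{\mathrm{Lk}_0} = \frac{2\pi}{\omega_0 L_{\rm tot}}, \\
\Delta G_{\rm un} &\approx L_{\rm tot}\,\frac{d\mu}{d\sigma}\,\Delta\sigma
   = \frac{2\pi}{\omega_0}\,\frac{d\mu}{d\sigma}, \\
\Delta G_{\rm un} = 2\pi M_{\rm ext}
   \quad &\Longrightarrow \quad \frac{d\mu}{d\sigma} = \omega_0 M_{\rm ext}.
\end{align*}
```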

We now turn to loop formation. The *-loop is planar and untwisted. Thus its formation not
only reduces the length of the remaining free exterior region from $L_{\rm tot}$ to $L_{\rm tot} - L$; it also expels
some linking number from the looped region, changing $\sigma$ to $\sigma' = \sigma L_{\rm tot}/(L_{\rm tot} - L) \approx \sigma(1 + L/L_{\rm tot})$.
Neglecting higher orders in $L/L_{\rm tot}$, the corresponding change of free energy is thus

$$\begin{aligned}
\Delta G_{\rm bind} + G_{\rm un}(L_{\rm tot} - L, \sigma') - G_{\rm un}(L_{\rm tot}, \sigma)
 &\approx \Delta G_{\rm bind} + (L_{\rm tot} - L)\,\mu\!\left(\sigma + \frac{\sigma L}{L_{\rm tot}}\right) - G_{\rm un}(L_{\rm tot}, \sigma) \\
 &\approx \Delta G_{\rm bind} + L\left(-\mu(\sigma) + \sigma\,\omega_0 M_{\rm ext}\right) \\
 &\approx \Delta G_{\rm bind} - L\,\mu(0).
\end{aligned} \qquad (4)$$

Here $\Delta G_{\rm bind}$ is the binding free energy for assembly of the protein complex (see footnote 4). The total free energy
change $\Delta G_{{\rm open}\to *}$ is the quantity in Eq. 4 plus the elastic energy $\mathcal{E}_*$ of the *-loop (recall that we
neglect the conformational entropy of the looped regions).

The free energy $G_{\rm un}$ of supercoiled DNA has been investigated both theoretically and experimentally
[44]. Rather than attempting to evaluate it explicitly, we now just observe that Eq. 4 is
a fixed, linear function of $L$; it does not contribute to the quasiperiodic modulation seen in Fig. 1,
and it does not depend on which looped state $n$ we will eventually form. We can drop these parts
of the free energy change if we understand that our result will be correct only modulo the addition
of some linear function of $L$ to our calculated free energy change of looping. This limitation does
not impair our ability to predict the periodic modulation of the free energy change, nor to find
nonlinear behavior such as a dip in its envelope at a particular value of $L$, nor to investigate the
equilibrium between competing topoisomers (various $n$), nor to explore the $\sigma$ dependence of the
looping free energy.

Again, henceforth we will drop the contributions to the looping free energy given by Eq. 4, or
in other words we take $\Delta G_{{\rm open}\to *} = \mathcal{E}_*$ in Eq. 2.

4 Recall that we are assuming that the DNA is permanently bound to the protein. More generally we need to 
account for the fact that protein-DNA binding generally unwinds the DNA; for example in lac the unwinding is 
nearly one radian. The work done by external torsional stress against this rotation effectively modifies the binding 
constant relative to the value appropriate for isolated operator fragments. 

III.D. From *-loop to physical looped states

For the special values $L = L_0 + jL_{\rm helix}$ mentioned in Sect. III.A, the *-state coincides with one
of the physical looped states. For other values of $L$, the *-state is a useful intermediate, because
as we have seen its formation has a simple effect on the DNA outside the looping region, and so
does the passage from it to the actual looped states.

Our procedure, then, is the following. We begin by choosing starting guesses for the unknown
parameter $L_0$ describing the periodically spaced special values of $L$, and the poorly known
parameter $\theta_a$. We set reasonable values for the remaining parameters $M_{\rm ext} \approx -1\,k_BT$, $a \approx 4.0$ nm and
for the elastic constants of DNA.

We then step through the various values of $L$ in the range of interest. For each $L$, we construct
the *-loop (Sect. IV.B.1 below) as the planar, non-selfintersecting loop that meets all the boundary
conditions imposed by the coupler except for axial orientation, and solves the elastic equilibrium
equations. We call its elastic energy $\mathcal{E}_*$.

If $L$ is one of the special values, then the *-loop is one of the possible physical looped states.
Whether or not this is true, we next perturb the *-loop through a family of nonplanar solutions to
the elastic rod equilibrium equations, maintaining always the same relative position and tangents
at the ends (Sect. IV.B.4 below). Each solution in this family has a final orientation differing
from the *-loop by an axial rotation. The corresponding rotation angle $\psi$ is a real number (not
ambiguous modulo $2\pi$). Each time $\psi$ passes through a value

$$\psi_n = (L - L_0)\,\omega_0 + 2\pi n \quad \text{for an integer } n, \qquad (5)$$

we get a physical looped state. The angle $\psi_n$ may be either positive or negative (or zero if $L$
takes one of the special values). We compute the elastic energy of this loop and call it $\mathcal{E}_n$. For
each physical loop found, we correct the energy to $\mathcal{E}'_n = \mathcal{E}_n - \psi_n M_{\rm ext}$ to account for the external
torsional stress, obtaining $\Delta G_{*\to{\rm loop}\,n} = \mathcal{E}_n - \psi_n M_{\rm ext} - \mathcal{E}_*$.

The quantity $\mathcal{E}_*$ cancels when we compute the total free energy change (Eq. 2); as described in
Sect. III.C.2 we also drop the linear contribution Eq. 4. Thus

$$\Delta G_{{\rm open}\to{\rm loop}\,n} = \mathcal{E}'_n = \mathcal{E}_n - \psi_n M_{\rm ext}, \qquad (6)$$

modulo the addition of a fixed, linear function of $L$.

Eq. 6 embodies one of the main points of this paper. We can understand it physically by thinking
about Fig. 3a: The two looping states $n$ and $n'$ have the same elastic energy, but one is favored
and the other disfavored by external torsional stress. The correction term in Eq. 6 incorporates
this effect.

In general, we will only obtain one or two solutions in this way for each value of $L$; as mentioned
earlier, higher topoisomers are stabilized by self-contact, which our model neglects. We now plot
each energy value $\mathcal{E}'_n$ versus its $L$. Taking the smallest $\mathcal{E}'_n$ for each $L$ gives a graph (Fig. 5 below).
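
The selection rule of this subsection can be sketched in a few lines of Python. The sketch is illustrative only: `toy_elastic_energy` is a hypothetical quadratic stand-in for the elastic energy $\mathcal{E}_n$ (which the text obtains from the elastic-rod solution), and the stiffness value and the choice $L_0 = 0$ are assumptions made for the example. Energies and $M_{\rm ext}$ are in units of $k_BT$.

```python
import math

def candidate_angles(L, L0, omega0, n_values):
    """Return {n: psi_n} with psi_n = (L - L0) * omega0 + 2*pi*n (Eq. 5)."""
    return {n: (L - L0) * omega0 + 2.0 * math.pi * n for n in n_values}

def toy_elastic_energy(psi, stiffness=0.5):
    """Hypothetical stand-in for the elastic energy of a loop twisted by psi.

    The actual calculation evaluates the elastic-rod solution; a quadratic
    in psi is used here only so that the sketch runs on its own.
    """
    return 0.5 * stiffness * psi ** 2

def favored_topoisomer(L, L0, omega0, M_ext, n_values=range(-2, 3)):
    """Pick the n that minimizes E'_n = E_n - psi_n * M_ext (Eq. 6)."""
    best = None
    for n, psi in candidate_angles(L, L0, omega0, n_values).items():
        corrected = toy_elastic_energy(psi) - psi * M_ext
        if best is None or corrected < best[2]:
            best = (n, psi, corrected)
    return best

omega0 = 2.0 * math.pi / 11.0   # natural twist rate in radians per bp (about 11 bp per turn)
L0 = 0.0                        # assumed special length for the example
L_half = L0 + 5.5               # half a helical repeat past a special value

n_neg, psi_neg, _ = favored_topoisomer(L_half, L0, omega0, M_ext=-1.0)
n_pos, psi_pos, _ = favored_topoisomer(L_half, L0, omega0, M_ext=+1.0)
print(n_neg, psi_neg, n_pos, psi_pos)
```

With these toy inputs the two competing candidates are $\psi = \pm\pi$, and reversing the sign of $M_{\rm ext}$ switches which of them is selected, which is the exchange of stability described in the text.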

FIG. 4: The geometry of our 2D exercise. The contour length of the DNA in the loop is $L$. The size of the protein
complex is represented by $a$; its geometry is summarized by the parameter $\theta_a$. $a$ and $\theta_a$ determine the boundary
conditions for the boundary value problem for the geometry of the loop. The element at arclength $s$ from the center
exerts a force $F$ on the element at $s + ds$. Due to the assumed symmetry, $F$ points along the negative $z$-axis as
shown. $F$ is also the sideways force exerted on the ends of the rod by the protein complex.

Finally, we repeat the whole procedure with different values of the parameters $L_0$ and $\theta_a$, and seek
values that are reasonable and that roughly reproduce experimental data like those in Fig. 1.

IV. CALCULATION DETAILS 

We now give details of the calculation outlined in the previous section. 

IV. A. Two dimensional warmup problem 

As a warmup problem, we consider the analogous elasticity problem in two dimensions. That
is, we find the profile $z(s)$ and $y(s)$ of a planar elastic loop (Fig. 4), where $s$ denotes the arc-length
along the loop with the origin $s = 0$ placed midway along the contour. Our equations will be
simple because twist elasticity plays no role in two dimensions.

The boundary value problem for the loop can be stated as

$$K_{\rm bend}\,\theta'' + F\sin\theta = 0, \qquad \theta(0) = 0, \qquad \theta(L/2) = \pi + \theta_a, \qquad (7)$$

where $K_{\rm bend}$ is the bending modulus of the elastic rod, $\theta(s)$ is the angle from the positive $z$-axis
to the tangent and $F$ is an unknown force acting along the $z$-axis and exerted by the end-clamp
on the rod. Primes denote differentiation with respect to the arclength $s$.

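The solution to the second order differential equation above is well known [50] and can be
written as

$$\theta(s) = 2\,{\rm am}\!\left(\frac{s}{\lambda k}\,\Big|\,k\right), \qquad (8)$$

where ${\rm am}(x|k)$ is the Jacobi amplitude function (the inverse of the elliptic integral of the first kind) with modulus $k$. $\lambda = \sqrt{K_{\rm bend}/F}$ and $k$ are
independent parameters, to be determined from the boundary data. Thus

$$\cos\theta(s) = \frac{dz}{ds} = 1 - 2\,{\rm sn}^2\!\left(\frac{s}{\lambda k}\,\Big|\,k\right), \qquad (9)$$

$$\sin\theta(s) = \frac{dy}{ds} = 2\,{\rm sn}\!\left(\frac{s}{\lambda k}\,\Big|\,k\right){\rm cn}\!\left(\frac{s}{\lambda k}\,\Big|\,k\right). \qquad (10)$$

We can integrate these equations and obtain the following solution for $z(s)$ and $y(s)$, which are the
coordinates of the material point denoted by $s$ on the rod:

$$z(s) = s - 2\int_0^s {\rm sn}^2\!\left(\frac{\sigma}{\lambda k}\,\Big|\,k\right)d\sigma, \qquad (11)$$

$$y(s) = 2\int_0^s {\rm sn}\!\left(\frac{\sigma}{\lambda k}\,\Big|\,k\right){\rm cn}\!\left(\frac{\sigma}{\lambda k}\,\Big|\,k\right)d\sigma = \frac{2\lambda}{k}\left(1 - {\rm dn}\!\left(\frac{s}{\lambda k}\,\Big|\,k\right)\right). \qquad (12)$$

Fig. 4 shows a typical solution from this family.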

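The two constants $\lambda$ and $k$ can be determined in terms of the given $a$ and $\theta_a$ by imposing the
boundary conditions, leading to the following two equations:

$$\pi + \theta_a = \theta(L/2) = 2\,{\rm am}\!\left(\frac{L}{2\lambda k}\,\Big|\,k\right), \qquad (13)$$

$$\frac{a}{2} = \frac{L}{2} - 2\int_0^{L/2} {\rm sn}^2\!\left(\frac{\sigma}{\lambda k}\,\Big|\,k\right)d\sigma. \qquad (14)$$

We denote $y_p = L/(2\lambda k)$ and eliminate $\lambda$ in favor of $y_p$, obtaining

$$\pi + \theta_a = 2\,{\rm am}(y_p|k), \qquad (15)$$

$$y_p\left(1 - \frac{a}{L}\right) = 2\int_0^{y_p} {\rm sn}^2(\beta|k)\,d\beta = \frac{2}{k^2}\left(y_p - E(y_p|k)\right), \qquad (16)$$

where $E(y_p|k) = \int_0^{y_p} {\rm dn}^2(x|k)\,dx$ is the incomplete elliptic integral of the second kind with modulus
$k$. Once we solve numerically for $y_p$ and $k$ from these equations we can obtain the unknown force
$F$ as

$$F = \frac{K_{\rm bend}}{\lambda^2} = \frac{4K_{\rm bend}\,y_p^2 k^2}{L^2}. \qquad (17)$$

Also, the bending moment $M$ applied by the protein at $s = L/2$ is given by

$$M = K_{\rm bend}\,\theta'(L/2) = \frac{4K_{\rm bend}\,y_p}{L}\sqrt{1 - k^2\cos^2\frac{\theta_a}{2}}. \qquad (18)$$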

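Finally we calculate the elastic energy stored in the loop.

$$\begin{aligned}
\mathcal{E}_{\rm elas}[\theta(s)] &= \int_{-L/2}^{L/2}\left(\frac{K_{\rm bend}}{2}\,\theta'^2(s) - F\cos\theta(s)\right)ds \\
 &= F\int_{-L/2}^{L/2}\left(\frac{2}{k^2}\,{\rm dn}^2\!\left(\frac{s}{\lambda k}\,\Big|\,k\right) + 2\,{\rm sn}^2\!\left(\frac{s}{\lambda k}\,\Big|\,k\right) - 1\right)ds \\
 &= F\int_{-L/2}^{L/2}\left(\frac{2}{k^2} - 1\right)ds = FL\left(\frac{2}{k^2} - 1\right).
\end{aligned} \qquad (19)$$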

Similar formulas have appeared in earlier work (e.g. [33]).
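
The closed forms of this subsection can be cross-checked numerically without any elliptic-function library. The sketch below is not the method of the text: it swaps in a plain fourth-order Runge-Kutta integration of $\theta'' = -\sin\theta/\lambda^2$ (Eq. 7 with $F = K_{\rm bend}/\lambda^2$), started with $\theta(0) = 0$ and $\theta'(0) = 2/(\lambda k)$, which is the initial slope of the revolving solution. The function name and the parameter values are assumptions chosen for illustration.

```python
import math

def integrate_elastica(lam, k, half_length, steps=4000):
    """RK4 integration of theta'' = -sin(theta)/lam**2 from s = 0 to s = L/2.

    The state is (theta, theta', z, y) with dz/ds = cos(theta) and
    dy/ds = sin(theta).  Returns the state at s = half_length.
    """
    def rhs(state):
        theta, dtheta, _z, _y = state
        return (dtheta, -math.sin(theta) / lam ** 2, math.cos(theta), math.sin(theta))

    state = (0.0, 2.0 / (lam * k), 0.0, 0.0)
    h = half_length / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(x + 0.5 * h * d for x, d in zip(state, k1)))
        k3 = rhs(tuple(x + 0.5 * h * d for x, d in zip(state, k2)))
        k4 = rhs(tuple(x + h * d for x, d in zip(state, k3)))
        state = tuple(
            x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

# Illustrative values only (arbitrary units).
K_bend, lam, k, half_length = 1.0, 1.0, 0.9, 2.0
F = K_bend / lam ** 2
theta_end, dtheta_end, z_end, y_end = integrate_elastica(lam, k, half_length)

# Eq. 12 rewritten with dn(s/(lam k)|k) = lam * k * theta'(s) / 2.
y_closed_form = (2.0 * lam / k) * (1.0 - 0.5 * lam * k * dtheta_end)

# Integrand of Eq. 19; it should equal F * (2/k**2 - 1) at every s.
energy_density = 0.5 * K_bend * dtheta_end ** 2 - F * math.cos(theta_end)
print(y_end, y_closed_form, energy_density, F * (2.0 / k ** 2 - 1.0))
```

The constancy of the integrand is what reduces the energy integral to the product of $F L$ and a number in Eq. 19.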

It is well known that the equations describing the shape of a bent rod are similar to the equations
of motion of a pendulum and that this analogy can be utilized to obtain rod shapes corresponding
to different regimes [50]. In the above we looked at the solution corresponding to the revolving
orbits of the pendulum. We can also have solutions corresponding to oscillating orbits of the
pendulum. In that case the solution is given by

$$\cos\theta(s) = 1 - 2k^2\,{\rm sn}^2\!\left(\frac{s}{\lambda}\,\Big|\,k\right). \qquad (20)$$

Corresponding to this solution we find that

$$F = \frac{4K_{\rm bend}\,y_p^2}{L^2}, \qquad M = \frac{4K_{\rm bend}\,y_p}{L}\sqrt{k^2 - \cos^2\frac{\theta_a}{2}}, \qquad \mathcal{E}_{\rm elas}[\theta(s)] = FL\,(2k^2 - 1), \qquad (21)$$

where now $y_p = L/(2\lambda)$.

IV.B. Elasticity theory: 3D calculation

IV.B.1. *-loop

The *-loop is by definition a planar, untwisted solution of the elastic equilibrium problem with
given separation and tangent vectors at the ends. As such its centerline coincides with the 2D
solution found in Sect. IV.A above. Its physical frame has $\mathbf{e}_3$ and $\mathbf{e}_1$ always in the $yz$-plane, and
$\mathbf{e}_2$ along $\hat{\mathbf{x}}$.

IV.B.2. Rod equilibrium

We now repeat our exercise for a uniform, inextensible, isotropic elastic rod with twist stiffness,
not necessarily confined to a plane. We again idealize the protein complex forming the loop as a
rigid object attaching to two specified points (representing specific binding sites) on the rod. We
assume that the length of the vector connecting one binding site to the other is given, and we call
it $a = |\mathbf{a}|$. The orientation of the physical frame attached at one site relative to the one attached
at the other site, as well as the orientation of $\mathbf{a}$ relative to either of those frames, are also assumed
to be given. The derivatives of the untwisted frame vectors $\{\mathbf{d}_i(s)\}$ as the arc-length $s$ changes
contain information about the local curvature and torsion of the rod:

$$\mathbf{d}_i' = \boldsymbol{\kappa}\times\mathbf{d}_i, \quad \text{for } i = 1, 2, 3. \qquad (22)$$

We define $\kappa_{1,2,3}$ as the components of the curvature vector $\boldsymbol{\kappa}(s)$ expanded in the frame $\{\mathbf{d}_i\}$.

The moment (or torque) $\mathbf{M}(s)$ at any point on the rod is assumed to have the Bernoulli-Euler
form

$$\mathbf{M} = K_{\rm bend}\,\kappa_1\mathbf{d}_1 + K_{\rm bend}\,\kappa_2\mathbf{d}_2 + K_{\rm tw}\,\kappa_3\mathbf{d}_3, \qquad (23)$$

where $K_{\rm bend}$ and $K_{\rm tw}$ are the bending and twisting moduli of the elastic rod (see footnote 5). Equivalently we can
specify the persistence lengths $\xi_b = K_{\rm bend}/k_BT$ and $\xi_{\rm tw} = K_{\rm tw}/k_BT$, where $k_BT$ is the thermal
energy scale. The equilibrium equation for the rod is then simply given by force balance, $\mathbf{F}' = 0$,
and by torque balance:

$$\mathbf{M}' + \mathbf{d}_3\times\mathbf{F} = 0. \qquad (24)$$

As in Fig. 4, $\mathbf{F}(s)$ is the force each element exerts on the next; it is also the force applied by the
protein on the DNA.



Following Nizette and Goriely [50], we will assume that the laboratory coordinate frame is
chosen in such a way that the constant internal force $\mathbf{F}$ is aligned with the laboratory $z$-axis:
$\mathbf{F} = F\hat{\mathbf{z}}$. We write the position vector $\mathbf{P}(s)$ of any point on the loop as $[X(s), Y(s), Z(s)]$ or in
cylindrical coordinates as $[R(s), \Phi(s), Z(s)]$:

$$\begin{aligned}
\mathbf{P}(s) &= X(s)\,\hat{\mathbf{x}} + Y(s)\,\hat{\mathbf{y}} + Z(s)\,\hat{\mathbf{z}} \\
 &= R(s)\cos\Phi(s)\,\hat{\mathbf{x}} + R(s)\sin\Phi(s)\,\hat{\mathbf{y}} + Z(s)\,\hat{\mathbf{z}}.
\end{aligned} \qquad (25)$$

Because we assume that our protein complex obeys the symmetric coupler condition (Sect. II.B.3),
we will be interested in loops that are also symmetric in a way that generalizes Fig. 2. Specifically,
we will find suitable equilibrium solutions satisfying our boundary conditions, which also obey

$$X(s) = -X(-s), \qquad Y(s) = Y(-s), \qquad Z(s) = -Z(-s). \qquad (26)$$

Eqs. 26 reduce to our planar form when $X(s) = 0$. They may appear to treat the $X$ and $Y$ directions
asymmetrically, but what is meant is that the solutions of interest can be brought to this form by
translation and rotation about $\hat{\mathbf{z}}$. Thus for example, if the loop is planar then we agree to place it
in the $yz$-plane, with the center point $s = 0$ at the origin. Eqs. 26 suggest that $\Phi(s)$ will take the
form $\Phi(s) = \frac{\pi}{2} - \alpha(s)$ with $\alpha(-s) = -\alpha(s)$, and indeed our solutions will have this property.

IV.B.3. Boundary conditions

We are now in a position to formulate the boundary condition describing the relative position
of the ends of the loop. Our first condition again imposes a separation of length $a$:

$$\left\|\mathbf{P}(\tfrac{L}{2}) - \mathbf{P}(-\tfrac{L}{2})\right\|^2 = a^2, \qquad (27)$$

where again $L$ is the loop contour length. Taking account of the symmetry of the loop, this
statement amounts to

$$R^2(\tfrac{L}{2})\cos^2\Phi(\tfrac{L}{2}) + Z^2(\tfrac{L}{2}) = \frac{a^2}{4}. \qquad (28)$$
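
The step from Eq. 27 to Eq. 28 can be written out explicitly. Assuming only the symmetry conditions of Eq. 26, the $Y$ components of the two end points cancel while the $X$ and $Z$ components add:

```latex
\begin{align*}
\mathbf{P}(\tfrac{L}{2}) - \mathbf{P}(-\tfrac{L}{2})
  &= 2X(\tfrac{L}{2})\,\hat{\mathbf{x}} + 2Z(\tfrac{L}{2})\,\hat{\mathbf{z}}, \\
\left\|\mathbf{P}(\tfrac{L}{2}) - \mathbf{P}(-\tfrac{L}{2})\right\|^2
  &= 4\left[X^2(\tfrac{L}{2}) + Z^2(\tfrac{L}{2})\right]
   = 4\left[R^2(\tfrac{L}{2})\cos^2\Phi(\tfrac{L}{2}) + Z^2(\tfrac{L}{2})\right] = a^2 .
\end{align*}
```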



5 Note that the external torque, $M_{\rm ext}$, need not equal $M_3$ at the points $s_i$, $s_f$, because the coupler can exert torque
on the DNA.

Notice that our choice of loop orientation implies that the end-to-end vector of the loop $\mathbf{a}$ lies in
the $xz$-plane, though not in general along the $z$-axis as in Fig. 4.

The tangent vector at any point on the rod is given by $\mathbf{t}(s) = \mathbf{P}'(s)/\|\mathbf{P}'(s)\|$. Explicitly,

$$\mathbf{P}'(s) = \left(R'(s)\cos\Phi(s) - R(s)\Phi'(s)\sin\Phi(s)\right)\hat{\mathbf{x}} + \left(R'(s)\sin\Phi(s) + R(s)\Phi'(s)\cos\Phi(s)\right)\hat{\mathbf{y}} + Z'(s)\,\hat{\mathbf{z}}, \qquad (29)$$

which can be rewritten as

$$\mathbf{t}(s) = \mathbf{P}'(s) = T(s)\cos\phi(s)\,\hat{\mathbf{x}} + T(s)\sin\phi(s)\,\hat{\mathbf{y}} + \cos\theta(s)\,\hat{\mathbf{z}}, \qquad (30)$$

where we define $\theta(s)$, $\phi(s)$ and $T(s)$ through

$$\cos\theta(s) = Z'(s), \qquad (31)$$
$$\tan\phi(s) = \tan\!\left(\Phi(s) + \tan^{-1}\frac{R(s)\Phi'(s)}{R'(s)}\right), \qquad (32)$$
$$T^2(s) = R'^2(s) + R^2(s)\Phi'^2(s) = 1 - Z'^2(s). \qquad (33)$$

The last equation reflects the inextensibility of the rod. 
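
As a brief check of the definitions of $\phi(s)$ and $T(s)$, one can divide the $\hat{\mathbf{y}}$ component of $\mathbf{P}'(s)$ by its $\hat{\mathbf{x}}$ component and use the tangent addition formula. The shorthand $u$ below is introduced only for this sketch:

```latex
\begin{align*}
\tan\phi &= \frac{R'\sin\Phi + R\,\Phi'\cos\Phi}{R'\cos\Phi - R\,\Phi'\sin\Phi}
          = \frac{\tan\Phi + u}{1 - u\tan\Phi}
          = \tan\!\left(\Phi + \tan^{-1}u\right),
          \qquad u \equiv \frac{R\,\Phi'}{R'}, \\
T^2 &= \left(R'\cos\Phi - R\,\Phi'\sin\Phi\right)^2 + \left(R'\sin\Phi + R\,\Phi'\cos\Phi\right)^2
     = R'^2 + R^2\Phi'^2 .
\end{align*}
```

Since $\|\mathbf{P}'(s)\| = 1$ for an inextensible rod parametrized by arclength, $T^2 + Z'^2 = 1$ follows.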

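Our planar coupler requires that the vectors $\mathbf{a}$, $\mathbf{P}'(-\tfrac{L}{2})$ and $\mathbf{P}'(\tfrac{L}{2})$ all lie in a common plane,
even if the full loop is not planar. Accordingly, we will seek solutions satisfying a second boundary
condition:

$$\mathbf{a}\cdot\left(\mathbf{P}'(-\tfrac{L}{2})\times\mathbf{P}'(\tfrac{L}{2})\right) = 0. \qquad (34)$$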

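Using the assumed symmetry of the shape of the loop, we see that this boundary condition can be
satisfied if either

$$Z(\tfrac{L}{2})\left[R'(\tfrac{L}{2})\cos\Phi(\tfrac{L}{2}) - R(\tfrac{L}{2})\Phi'(\tfrac{L}{2})\sin\Phi(\tfrac{L}{2})\right] - Z'(\tfrac{L}{2})\,R(\tfrac{L}{2})\cos\Phi(\tfrac{L}{2}) = 0 \qquad (35)$$

or

$$R'(\tfrac{L}{2})\sin\Phi(\tfrac{L}{2}) + R(\tfrac{L}{2})\Phi'(\tfrac{L}{2})\cos\Phi(\tfrac{L}{2}) = 0. \qquad (36)$$

Substituting Eq. 36 into Eq. 32 shows that the second of these conditions would imply $\phi(\pm\tfrac{L}{2}) = 0$.
This in turn would imply that the end tangents $\mathbf{P}'(\pm\tfrac{L}{2})$ are parallel and lie in the $xz$-plane.
This violates the assumed geometry of the coupler depicted in Fig. 2. The end tangents need not
be parallel. So in this section we pursue only the condition represented by Eq. 35, which allows
for loops with the desired coupler geometry.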
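
The two alternatives can be read off by expanding the triple product in Eq. 34. The sketch below assumes only the symmetry of Eq. 26, so that $X'$ and $Z'$ are even in $s$ while $Y'$ is odd; all quantities on the right are evaluated at $s = L/2$:

```latex
\begin{align*}
\mathbf{a} &= (2X,\ 0,\ 2Z), \qquad
\mathbf{P}'(\pm\tfrac{L}{2}) = (X',\ \pm Y',\ Z'), \\
\mathbf{P}'(-\tfrac{L}{2})\times\mathbf{P}'(\tfrac{L}{2}) &= (-2Y'Z',\ 0,\ 2X'Y'), \\
\mathbf{a}\cdot\left(\mathbf{P}'(-\tfrac{L}{2})\times\mathbf{P}'(\tfrac{L}{2})\right)
  &= 4\,Y'\left(Z X' - X Z'\right).
\end{align*}
```

The product vanishes either when $ZX' = XZ'$ or when $Y' = R'\sin\Phi + R\,\Phi'\cos\Phi = 0$; the latter is Eq. 36.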

The third boundary condition that we need to satisfy involves the angle at which the DNA exits
the protein complex. Generalizing Eq. 7 we will require that

$$\mathbf{a}\cdot\mathbf{P}'(\tfrac{L}{2}) = a\cos\theta_a. \qquad (37)$$

Together Eqs. 27, 35 and 37 can be recast as the boundary conditions

$$\frac{Z'(\tfrac{L}{2})}{Z(\tfrac{L}{2})} = \frac{2\cos\theta_a}{a}, \qquad (38)$$

$$R^2(\tfrac{L}{2})\cos^2\Phi(\tfrac{L}{2}) + Z^2(\tfrac{L}{2}) = \frac{a^2}{4}, \qquad (39)$$

$$\frac{R'(\tfrac{L}{2})}{R(\tfrac{L}{2})} - \Phi'(\tfrac{L}{2})\tan\Phi(\tfrac{L}{2}) = \frac{2\cos\theta_a}{a}. \qquad (40)$$

We will solve these equations starting from the most general solution to the differential equations
for the equilibrium of a bent and twisted rod. The solution yields analytical expressions for $R(s)$,
$Z(s)$ and $\Phi(s)$, which will be substituted in the above to obtain algebraic equations. Nizette and
Goriely [50] give the solution in terms of four parameters $\zeta_{1,2,3}$ and $\lambda$ as

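$$Z(s) = \zeta_3 s - \lambda(\zeta_3 - \zeta_1)\,E\!\left(\frac{s}{\lambda}\,\Big|\,k\right), \qquad (41)$$

$$R^2(s) = 2\lambda^2\left(h - Z'(s)\right), \qquad (42)$$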
~r / \ A /,, s M 3 — M z h^.s. , ,\ 7r , 4 . 

where $\Pi$ is the elliptic integral of the third kind and

M 3 = (V(i + Ci)(l + C2XC3 + 1) + V(i " OKI - C2XC3 - 1)) , (44) 



\/2A 

M z = ^ (V(i + Ci)(i + C 2 )(C3 + i)- v / (i-Ci)(i-C 2 )(C3-i)) , (45) 



L Ci + C2 + Cs - C1C2C3 + VC 1 - Ci 2 )(l " C 2 2 )(C 3 2 - 1) , (46) 



2 

C2 — Cl ,2 C2 — Cl C2 - Cl / 

The quantities $M_3$ and $M_z$ turn out to be the components of the moment vector along $\mathbf{d}_3$ and $\hat{\mathbf{z}}$,
respectively [50].

We must now fix the parameters $\zeta_{1,2,3}$ and $\lambda$ by imposing boundary conditions. In addition
to Eqs. 38-40 we also need an expression for how the frames at each end of the rod are oriented
with respect to each other. To formulate such an expression, note that for any choice of $a$ and $\theta_a$
there will be one non-selfintersecting loop solution with zero excess twist: the *-loop. Taking
this as a reference configuration, any other equilibrium solution with the same $a$ and $\theta_a$ and the
same initial $\mathbf{d}_i(-\tfrac{L}{2}) = \mathbf{d}_{i,{\rm ref}}(-\tfrac{L}{2})$ will have a final orientation $\mathbf{d}_i(+\tfrac{L}{2})$ differing from $\mathbf{d}_{i,{\rm ref}}(+\tfrac{L}{2})$
by a rotation about $\mathbf{t}(+\tfrac{L}{2})$. We need to find the corresponding rotation angle $\psi$, then impose the
condition Eq. 5.

The angle $\psi$, divided by $2\pi$, may be regarded as a linking number difference, generalized to
the case of open curves. To evaluate it, we need a generalization of the Fuller-White-Calugareanu
relation $\Delta{\rm Lk} = \Delta{\rm Tw} + \Delta{\rm Wr}$ for open curves. We start with the *-loop, with its untwisted frame.
Next we construct a plane, untwisted, circular arc $C$, attached to the two ends of the *-loop to
form a closed, smooth, framed curve. Completing the *-loop in this way gives a closed loop with
${\rm Tw} = {\rm Wr} = 0$. Also, the tangents at the ends of $C$ match the tangents at the ends of any of the
family of loops we are studying.

Completing any other loop in our family with the same $C$ gives a discontinuity in the axial
orientation at one of the attachment points. Hence the formula for linking number will not give
an integer; instead, $2\pi\,{\rm Lk}$ is the desired formula for $\psi$. Setting it equal to one of the desired values
(Eq. 5) gives our fourth boundary condition.

In summary, our fourth boundary condition is $\psi = (L - L_0)\,\omega_0 + 2\pi n$ for an integer $n$, where

$$\begin{aligned}
\psi = 2\pi\,{\rm Lk} &= 2\pi({\rm Tw} + {\rm Wr}) \\
 &= \int_{-L/2}^{L/2}\kappa_3\,ds + \frac{1}{2}\oint\!\!\oint \mathbf{t}(s)\cdot\mathbf{t}(s')\times\frac{\mathbf{P}(s) - \mathbf{P}(s')}{\|\mathbf{P}(s) - \mathbf{P}(s')\|^3}\,ds\,ds'.
\end{aligned} \qquad (48)$$

Here $\oint \ldots ds$ refers to a line integral over the closed loop consisting of the arc $C$ plus the open loop
under study.
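
The writhe contribution to $\psi$ is $2\pi\,{\rm Wr}$, with ${\rm Wr}$ given by the Gauss double integral over the closed curve. The sketch below is a crude midpoint-rule discretization for a closed polygon, written only to illustrate the formula; the function name, the sample curves and the number of vertices are assumptions, and nothing here is claimed to be the numerical scheme used for the results in this paper.

```python
import math

def writhe(points):
    """Midpoint-rule estimate of the Gauss double integral for a closed polygon.

    points: list of (x, y, z) vertices; the curve closes from the last vertex
    back to the first.  Returns Wr, so the writhe part of psi is 2*pi*Wr.
    """
    n = len(points)
    mids, segs = [], []
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        mids.append(tuple(0.5 * (a + b) for a, b in zip(p, q)))
        segs.append(tuple(b - a for a, b in zip(p, q)))  # plays the role of t(s) ds
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the integrand is skipped on the diagonal
            r = tuple(a - b for a, b in zip(mids[i], mids[j]))
            dist3 = math.sqrt(sum(c * c for c in r)) ** 3
            # cross = t(s') x (P(s) - P(s'))
            cx = segs[j][1] * r[2] - segs[j][2] * r[1]
            cy = segs[j][2] * r[0] - segs[j][0] * r[2]
            cz = segs[j][0] * r[1] - segs[j][1] * r[0]
            total += (segs[i][0] * cx + segs[i][1] * cy + segs[i][2] * cz) / dist3
    return total / (4.0 * math.pi)

N = 60
angles = [2.0 * math.pi * i / N for i in range(N)]
planar_circle = [(math.cos(t), math.sin(t), 0.0) for t in angles]
wavy_loop = [(math.cos(t), math.sin(t), 0.3 * math.sin(2.0 * t)) for t in angles]
mirrored_loop = [(x, y, -z) for (x, y, z) in wavy_loop]

wr_planar = writhe(planar_circle)
wr_wavy = writhe(wavy_loop)
wr_mirror = writhe(mirrored_loop)
print(wr_planar, wr_wavy, wr_mirror)
```

Two properties used in the text are visible here: a planar closed curve has zero writhe, and mirror reflection reverses the sign of the writhe.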



IV.B.4. Solution strategy

There are four parameters in the above equations: $\zeta_1$, $\zeta_2$, $\zeta_3$ and $\lambda$. As in Sect. IV.A, we must
find values for these parameters by imposing the boundary conditions (Eqs. 27, 35, 37 and 48).
This leads to a set
of equations that must be solved numerically, using Newton's method. Our initial guess for $\zeta_1$, $\zeta_2$,
$\zeta_3$ and $\lambda$ for given boundary data corresponds to the values of these parameters for a planar loop.
For example, we know how to solve for $k$ and $\lambda$ for a planar loop (for which $\psi = 0$) of length $L$,
end-separation $a$ and end-angle $\theta_a$. For a three dimensional loop with similar data (but $\psi \neq 0$)
we initialize Newton's method with the guess $\zeta_1 = -1$, $\zeta_2 = 2k^2 - 1$, $\zeta_3 = 1$. We then make a
small increment in $\psi$ and solve a system of four algebraic equations to obtain the nearby values
of $\zeta_1$, $\zeta_2$, $\zeta_3$ and $\lambda$ that give this value of $\psi$. We continue this incremental process until we have
achieved one of the values of $\psi_n$ dictated by the contour length $L$ between the repressor binding
sites (Eq. 5). This numerical procedure corresponds physically to holding a rod in a planar
loop and subsequently rotating one end, continuously changing the shape of the loop. Once we
have computed this solution, the elastic energy stored in the DNA is evaluated using the following
expression (see [50]):



FL r 



f 



Ci + C 2 + Cs - C1C2C3 - v /(i-C 1 2 )(i-C|)(C 3 2 -i)l + t^- (49) 

Then we continue the incremental search both forward and backward in $\psi$ looking for other
topoisomers.
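
The incremental strategy described in this subsection is a form of parameter continuation: the solution at one value of $\psi$ seeds Newton's method at the next. The sketch below shows the pattern on a deliberately simple scalar toy problem, $x^3 + x - \psi = 0$, which is an assumption made for illustration and stands in for the four-unknown system in $\zeta_1$, $\zeta_2$, $\zeta_3$, $\lambda$; the function names are hypothetical.

```python
def newton_solve(residual, derivative, x0, tol=1e-12, max_iter=50):
    """Scalar Newton iteration; returns the root reached from the guess x0."""
    x = x0
    for _ in range(max_iter):
        step = residual(x) / derivative(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

def continue_in_psi(psi_target, n_steps=20):
    """Walk psi from 0 to psi_target, reusing each root as the next guess.

    Toy residual: x**3 + x - psi.  Its root at psi = 0 is x = 0, which plays
    the role of the planar starting loop.
    """
    x = 0.0
    path = [(0.0, x)]
    for i in range(1, n_steps + 1):
        psi = psi_target * i / n_steps
        x = newton_solve(lambda v: v ** 3 + v - psi,
                         lambda v: 3.0 * v ** 2 + 1.0,
                         x)
        path.append((psi, x))
    return path

path = continue_in_psi(2.0)
print(path[-1])
```

Stepping backward is the same loop with a negative `psi_target`; in the real problem each step would solve the four boundary conditions simultaneously rather than a single equation.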

V. RESULTS

Our goal is to capture certain broad features of the looping free energy change as obtained
from experiments in the analysis of Refs. [10], [11] (Fig. 1a). Beyond the gross structure, there is
a shallow minimum in the looping free energy change, in the 70-80 bp range. Keeping in mind
that the horizontal axis of the graph differs from our $L$ by 21 bp, this minimum corresponds to
$L \approx 50$-60 bp. In contrast, for loops formed in cyclization reactions [52] the minimum in free
energy change occurs at about 500 bp [3]. We will see that our elastic rod model, incorporating
supercoiling effects and a highly simplified form of the geometry of the repressor-DNA complex,
does reproduce some qualitative features in the length dependence of the free energy change. To
do this, we now apply the methods of analysis outlined in the previous sections.

Sect. II described the idealizations we have made to keep the role of external torsional stress as
clear as possible. These idealizations limit our ability to make quantitative predictions for specific
systems, but nevertheless we will apply our method using parameters partially inspired by the
structure of the lac repressor complex. Thus, we estimate the distance between the two ends of
the DNA to be $a \approx 4.0$ nm. We estimate $M_{\rm ext} = -1.0\,k_BT$ (see Sect. III.A.2), and take effective
values for the elastic constants from experiments on the cyclization of short DNA (see footnote 6): $\xi_b = 50$ nm,
$\xi_{\rm tw} = 18$ nm [52]. Finally, we make initial guesses $\theta_a = 71°$ and $L_0 = 0$ for the unknown parameters.
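
To get a feel for the energy scale these parameters set, one can evaluate the bending energy of a planar circle, $\frac{1}{2}K_{\rm bend}(2\pi/L)^2 L = 2\pi^2 K_{\rm bend}/L$, with a bending persistence length of 50 nm. This is only an order-of-magnitude sketch: the protein-mediated loop is not a circle, and the rise of 0.34 nm per basepair is an assumed standard value for B-DNA rather than a number quoted in the text. The function name is hypothetical.

```python
import math

def circle_bending_energy_kT(n_bp, xi_b_nm=50.0, rise_nm_per_bp=0.34):
    """Bending energy, in units of k_B T, of a planar circle of n_bp basepairs.

    Uses E / k_B T = 2 * pi**2 * xi_b / L, which follows from a uniform
    curvature 2*pi/L and K_bend = xi_b * k_B T.
    """
    contour_length_nm = n_bp * rise_nm_per_bp
    return 2.0 * math.pi ** 2 * xi_b_nm / contour_length_nm

energy_100bp = circle_bending_energy_kT(100)
energy_50bp = circle_bending_energy_kT(50)
print(energy_100bp, energy_50bp)
```

For 100 bp this gives roughly 29 $k_BT$, and the value doubles when the length is halved, which indicates why bending dominates the gross structure of the looping free energy for such short loops.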

Fig. 5 shows the free energy of loops with $M_{\rm ext} = -1.0\,k_BT$. This result shows that a simple
elastic rod model of the DNA is able to capture the general trend in the modulations of the free
energy. The amplitude of the modulations (about $4\,k_BT$) is correctly reproduced and the maxima
of the modulations are sharp, as found from experimental data by Saiz et al. [10]. The curve also
shows a dip in free energy at about 45 bp, fairly close to the dip in the experimental data. No such
dip is seen in the free energy of looping for nicked (non-twist storing) DNA, so its appearance is
influenced by the external torque due to supercoiling.

Fig. 6 shows the looping free energy change as a function of length for the three values
$M_{\rm ext} = -1$, $0$, and $+1\,k_BT$, illustrating our point that the exchanges of stability that determine
the dominant topoisomer at a given value of $L$ depend on the magnitude and sign of the external
torsional stress. This is easily seen by comparing panels (a) and (c) of Fig. 6, which show that
changing the sign of $M_{\rm ext}$ shifts the maxima of the modulations by about 4 bp. The sign of $M_{\rm ext}$
can in principle be controlled in an in vitro experiment, such as a magnetic bead assay (employed
in Lia et al. [la]). As a result the preference for a particular topoisomer of a loop will change and



6 The twist stiffness given here is smaller than the value found in single-molecule studies. Widom and Cloutier
found that this small value was needed to fit their data on cyclization of short DNA, and proposed that it was an
effective value reflecting a nonlinear breakdown of elasticity under high strain. Previous authors also found
that a significant, though less dramatic, reduction of the value of twist stiffness was needed to fit cyclization of
longer DNA [35]. In the present work we found that a reduced effective twist stiffness was needed to get the
required magnitude of the peak-to-valley energy change in Fig. 5; however, Saiz and Vilar have argued that the
existence of multiple looping geometries can also reduce this modulation [20].

FIG. 5: a. The elastic energy of DNA loops as calculated from an elastic rod model of DNA. We have assumed
$\theta_a = 71°$ and $a = 4.0$ nm with $\xi_b = 50$ nm, $\xi_{\rm tw} = 18$ nm and $M_{\rm ext} = -1.0\,k_BT$. The graph shows the quantity defined
in Eq. 6 plus an arbitrary linear function of $L$, because such a function was dropped in our derivation of Eq. 6.
The exchange of stability between various topoisomers at the peaks of the modulations (40, 52, 63, 74 bp) has been
emphasized by plotting the energy of the two competing topoisomers with different symbols (circles and stars). The
continuous line connects the lowest energy topoisomers at each value of the length $L$ of the loop. b. The black
curve plots the elastic energy of a planar loop without any twist, as calculated using Eq. 19. This curve would be
appropriate for looping with nicked DNA.

this will result in an altered dependence of the looping free energy on the contour length. Fig. [6] 
summarizes how this dependence will be altered for zero torque and a positive torque. Fig. [6] has 
been constructed for the geometry of the lac repressor but the conclusion that the magnitude and 
sign of the external torque $M_{\rm ext}$ controls the locations of the minima and maxima of the
modulations in the free energy of loop formation remains valid for any other DNA looping protein as
well. 

VI. DISCUSSION 

We have shown in this paper that an elastic rod model of DNA can explain certain features
observed in the length dependence of in vivo DNA looping free energy, if we account for the size 
and shape of the repressor-DNA complex, and for the external torsional stress from supercoiling 

FIG. 6: Predicting the effect of changing the sign and magnitude of $M_{\rm ext}$. Again we took $\theta_a = 71°$ and $a = 4.0$ nm
with $\xi_b = 50$ nm, $\xi_{\rm tw} = 18$ nm and plotted the energy for three values of $M_{\rm ext}$. All three panels have the same,
arbitrary, linear function of $L$ added to the looping free energy change. The sign of the assumed torque $M_{\rm ext}$ in panel
(c) is the opposite of that in panel (a). The minima and maxima in panel (c) are shifted by about 4 bp as compared
to those in panel (a).

in the bacterial chromosome. These features have not been adequately examined in the theoretical 
literature, although they are critical in determining the function of the repressor-DNA complex. 
We have also obtained predictions that could elucidate the mechanics of protein-DNA interactions, 
and we hope that they will inspire future experiments. One can adjust $M_{\rm ext}$ in vivo by studying
bacteria with varying levels of supercoiling density, for example by disabling the topoisomerases 
that normally maintain the genome under torsional stress ([23]). Or the present theory can be 
generalized to incorporate the effects of stretching force; then a magnetic bead assay could be 
used to test our prediction that the locations of the minima and maxima in the modulations of 
the free energy of loop formation can be controlled by the applied external torque. An important 
limitation of our theory, as presently stated, is that it does not address entropic effects and therefore 
is applicable only for short contour lengths of DNA. However, the efficient numerical approach 
developed in this paper remains applicable for longer contour lengths as well and could be used in 
conjunction with Monte Carlo methods or Molecular Dynamics to explore the effects of entropy 
on the mechanics of protein-DNA interactions.

APPENDIX A: SUMMARY OF NOTATION

$K_{\rm bend}$, $K_{\rm tw}$ are the bend and torsional elastic constants of DNA; the corresponding persistence
lengths are $\xi_b$ and $\xi_{\rm tw}$.

$\omega_0$ is the natural twist rate of DNA, about $2\pi$ radians per 11 bp. $L_{\rm helix}$ is the helix pitch of
DNA, about 11 bp.

Sj, Sf are the arclength positions at which the DNA exits its binding sites and enters the loop 
interior; s;, Sf are the corresponding positions where the DNA enters the exterior region. 
L = s\ — Sf is the spacing between exit points; Lq is a special value of this spacing for which 
the coupler admits a planar, untwisted loop. 

$\{\mathbf{e}_i(s), i = 1,2,3\}$ denote the physical orthonormal frame attached to the DNA at arclength
position $s$; $\{\mathbf{d}_i(s)\}$ is the corresponding untwisted frame.

$\kappa_{1,2,3}(s)$ denote components of the curvature vector at location $s$, measured in the untwisted
frame.

$\mathbf{t} = \mathbf{e}_3 = \mathbf{d}_3$ is the tangent vector to the DNA centerline, as a function of arclength position
along the DNA.

The "coupler" refers to a geometrical representation of a regulatory protein complex, imposing
a fixed relation between the spatial locations and physical orientations of two points on
the DNA. It is independent of the spacing $L$. A "physical loop configuration" is one obeying
the boundary conditions imposed by the coupler.

The "*-coupler" is a fictitious, modified coupler differing from the physical one by an
$L$-dependent axial rotation of one of the DNA strands relative to the other. Quantities
associated to it are denoted by a subscript *. The "*-loop configuration" is the loop of minimal
elastic energy obeying the boundary conditions imposed by the *-coupler.

$\mathbf{a}$ is the spatial separation between DNA detachment points; $a$ denotes its length.

$\theta_a$ is the exit angle characterizing the regulatory protein complex.

$\beta$ is the twist angle of the DNA-protein complex, set equal to zero in this paper.

$M_{\rm ext}$ is the torsional stress outside the looping region, same units as energy; $M_{1,2,3}(s)$ are the
components of the moment (torque) vector inside the loop at arclength position $s$, expressed
in the untwisted frame. Note that $M_{\rm ext} \neq M_3$ in general.

$\mathcal{E}_*$ denotes the elastic energy of the lowest-energy *-loop configuration; $\mathcal{E}$ denotes the elastic
energy of a physical looped state.

$n$ indexes which of the possible physical loop states is under discussion; U is the unlooped
state.

$\psi_n$ is the axial rotation angle by which physical loop $n$ differs from the *-loop.

$G_{\rm un}(L_{\rm tot}, \sigma)$ denotes the free energy of an unlooped circular DNA of length $L_{\rm tot}$ with
fractional excess linking number $\sigma$; $\mu(\sigma)$ is the corresponding free energy per unit length.

• am, sn, cn are the usual elliptic functions; $k$ is the modulus of an elliptic function. $E(y|k)$
is the incomplete elliptic integral of the second kind; $\Pi(y|n, k)$ is the elliptic integral of the
third kind.

• $\zeta_{1,2,3}$ are parameters entering the general elastic equilibrium solution in 3D.

$R(s)$, $\Phi(s)$, $Z(s)$ are cylindrical coordinates for the position of the rod at arclength position
$s$.

$\theta(s)$, $\phi(s)$ are spherical polar coordinates for the unit tangent vector to the rod at position
$s$.